Abstract

We summarize the Optimal Jet Definition and present the result of
a benchmark Monte Carlo test based on the W-boson mass extraction
from fully hadronic decays of pairs of W's.

1 Introduction 

Jets of hadrons which appear in the final states of scattering
experiments in high energy physics correspond, to the first
approximation, to quarks and gluons produced in the collisions. Quarks
and gluons, interacting strongly, are not observed as free particles.
Only some combinations of them, hadrons, can avoid the strong
interaction at large distances and only those combinations appear in
experiments. If the energy of the colliding particles is high enough,
the quarks and gluons produced in the collision manifest themselves as
jets of hadrons which move roughly in the same direction as the quarks
and gluons originating them.

*Talk given by E. Jankowski at Lake Louise Winter Institute: Particles
and the Universe, Lake Louise, Canada, February 16-22, 2003

Let us consider an example high energy event. An electron and 
positron collide at the CM energy equal to 180 GeV. The electron and 
positron annihilate and a pair of W-bosons is produced. Each of the 
W's decays into two quarks. When the quarks move away from each 
other, potential energy of the strong interaction between them grows 
quickly and new pairs of quarks and antiquarks are created out of 
this energy. The many quarks and antiquarks combine into colorless 
hadrons which form 4 or more jets. 

We are interested, for instance, in extracting the W-boson mass
from a collection of events similar to the one described above. It would
be much easier if we could observe directly the quarks coming
from the decaying W's. Instead we observe jets of hadrons, and the
analysis has to deal with the jets. This is not always easy. Jets may
be wide and/or overlap. It is hard to say even how many jets there are
and how to share the particles between them.

Another aspect is that a procedure to recognize and reconstruct
jets may give different answers for the same physical process
depending on whether it is applied at the level of quarks and
gluons in theoretical calculations, at the level of hadrons from Monte
Carlo simulations, or at the level of calorimeter cells in experiments.

The Optimal Jet Definition (OJD) avoids most of the problems of the
conventional schemes. The derivation of OJD from the properties of
the strong interaction and specifics of measurements involving multi- 
hadronic final states is contained in [3]. A short introduction to 
the subject is given in [4]. An efficient FORTRAN 77 implementation of OJD,
called the Optimal Jet Finder (OJF), is described in [5] and the source
code is available from [6]. Below we summarize OJD and present the
result of a benchmark Monte Carlo test based on the W-boson mass 
extraction from fully hadronic decays of pairs of W's.

2 Jet algorithms



The analysis of events with many hadrons is often performed with
the use of so-called jet algorithms. A jet algorithm is a procedure to
associate the particles into jets. It decides which particle belongs to
which jet. Often it also determines how many jets there are. (By
particles we may equally mean calorimeter cells or towers, when
the analysis is applied to experimental data, or partons in theoretical
calculations.)

After the content of each jet is known, some rule is chosen to com- 
pute the properties of the jet from the properties of the particles that 
belong to that jet. A simple and logical prescription, but not necessar- 
ily the only possible (see for discussion), is that the 4-momentum 
of the jet, $q_{\rm jet}$, is the sum of the 4-momenta $p_a$ of all particles that belong
to that jet: $q_{\rm jet} = \sum_{a \in {\rm jet}} p_a$.

There have been many jet definitions developed by various
collaborations over the years. Examples are the class of cone algorithms
(in various variants) and the family of successive recombination
algorithms such as $k_T$ (Durham), JADE and Geneva.

Cone algorithms define a jet as all particles within a cone of fixed 
radius. The axis of the cone is found, for instance, from the require- 
ment that it coincides with the direction of the net 3-momentum of 
all particles within the cone. 
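
The fixed-point requirement on the cone axis can be illustrated with a
minimal Python sketch. It is an illustration only and not the algorithm
of any particular collaboration: the function names, the seed axis and
the stopping rule (stop when the set of particles inside the cone no
longer changes) are our assumptions. Momenta are 3-vectors (px, py, pz)
with non-zero length, and a cone half-angle below $\pi/2$ is assumed so
that the summed momentum cannot vanish.

```python
import math


def angle(u, v):
    """Opening angle in radians between two non-zero 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def find_cone(momenta, seed_axis, radius, max_iter=100):
    """Iterate the cone axis until the cone content is stable.

    Returns (axis, members), where members are indices into momenta and
    axis is the net 3-momentum of those members (or the seed if the
    cone is empty).
    """
    axis = tuple(seed_axis)
    members = None
    for _ in range(max_iter):
        new_members = [i for i, p in enumerate(momenta)
                       if angle(p, axis) <= radius]
        if new_members == members or not new_members:
            return axis, new_members
        members = new_members
        # New axis: direction of the net 3-momentum inside the cone.
        axis = tuple(sum(momenta[i][k] for i in members) for k in range(3))
    return axis, members
```

For example, `find_cone([(1, 0.1, 0), (1, -0.1, 0), (-1, 0, 0)],
(1, 0.05, 0), 0.5)` collects the first two particles into one cone whose
axis points along their summed momentum.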

Successive recombination algorithms, in the simplest variant, work
as follows. The "distance" $d_{ab}$ between any two particles is computed
according to some definition, for example, $d_{ab}^2 = E_a E_b (1 - \cos\theta_{ab})$ for
JADE and $d_{ab}^2 = \min(E_a^2, E_b^2) (1 - \cos\theta_{ab})$ for $k_T$, where $E_a$ is the
energy of the a-th particle and $\theta_{ab}$ is the angle between the a-th and the
b-th particles. Then the pair with the smallest distance is merged
into one pseudo-particle with the 4-momentum given (for example)
by $p_{ab} = p_a + p_b$. In that way the number of (pseudo-)particles is
reduced by one. The procedure is repeated until the required number
of pseudo-particles is left (if we know in advance how many jets we
want) or until $d_{ab} > y_{\rm cut}$ for all a, b, where $y_{\rm cut}$ is some chosen
parameter. The remaining pseudo-particles are the final jets. The described
scheme corresponds to so-called binary algorithms, as they merge only
two particles at a time ($2 \to 1$). Other variants may correspond to
$3 \to 2$ or, more generally, to $m \to n$.
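
The binary scheme described above can be sketched in a few lines of
Python. This is a minimal illustration under our own assumptions, not a
production implementation: particles are tuples (E, px, py, pz) with
non-zero 3-momentum, only the stopping rule with a fixed number of jets
is shown, and the function names are hypothetical.

```python
import math


def cos_angle(p, q):
    """Cosine of the angle between the 3-momenta of p and q."""
    dot = p[1] * q[1] + p[2] * q[2] + p[3] * q[3]
    norm_p = math.sqrt(p[1] ** 2 + p[2] ** 2 + p[3] ** 2)
    norm_q = math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)
    return dot / (norm_p * norm_q)


def distance2(p, q, scheme):
    """Squared 'distance' between two (pseudo-)particles."""
    one_minus_cos = 1.0 - cos_angle(p, q)
    if scheme == "jade":
        return p[0] * q[0] * one_minus_cos
    if scheme == "kt":
        return min(p[0] ** 2, q[0] ** 2) * one_minus_cos
    raise ValueError("unknown scheme: " + scheme)


def recombine(particles, n_jets, scheme="kt"):
    """Merge the closest pair (2 -> 1) until n_jets pseudo-particles remain."""
    if n_jets < 1:
        raise ValueError("n_jets must be at least 1")
    jets = [tuple(p) for p in particles]
    while len(jets) > n_jets:
        best = None
        for i in range(len(jets)):
            for j in range(i + 1, len(jets)):
                d2 = distance2(jets[i], jets[j], scheme)
                if best is None or d2 < best[0]:
                    best = (d2, i, j)
        _, i, j = best
        # Recombination rule: p_ab = p_a + p_b.
        merged = tuple(a + b for a, b in zip(jets[i], jets[j]))
        jets = [jets[k] for k in range(len(jets)) if k not in (i, j)]
        jets.append(merged)
    return jets
```

The naive search over all pairs is repeated after every merge, so the
cost of this sketch grows as the cube of the number of input particles.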

With many available jet definitions, an obvious question is how to
decide which algorithm should be used. It should be clear that the
jets are defined (through the jet algorithm used) for the purpose of
data analysis. In our example the purpose is the W-boson mass extraction.
In this case we can measure how good a jet definition is by
how small the uncertainty in the extracted mass is. We based our
benchmark test of the Optimal Jet Definition on this idea.

3 Optimal Jet Definition 

The OJD works as follows. It starts with a list of particles (hadrons,
calorimeter cells, partons) and ends with a list of jets. To find the final
jet configuration we define $\Omega_R$, some function of a jet configuration.
The momenta of the input particles enter $\Omega_R$ as parameters. The final,
optimal jet configuration is found as the configuration that minimizes
$\Omega_R$.

The essential feature of this jet definition is that it takes into
account the global structure of the energy flow of the event. The
binary algorithms mentioned above consider only the two closest
particles at a time when deciding whether to merge them or not.

A jet configuration is described by the so-called recombination
matrix $z_{aj}$, where $a = 1, 2, \ldots, N_{\rm part}$ indexes the input particles with
4-momenta $p_a$ and $j = 1, 2, \ldots, N_{\rm jets}$ indexes the jets. $z_{aj}$ is interpreted as
the fraction of the a-th particle that goes into formation of the j-th jet.
The conventional schemes correspond to restricting $z_{aj}$ to either one
or zero depending on whether or not the a-th particle belongs to the
j-th jet. Here we require only that $0 \le z_{aj} \le 1$ and $\sum_j z_{aj} \le 1$. The
4-momentum of the j-th jet is given by $q_j = \sum_a z_{aj} p_a$. The 4-direction
of the j-th jet is defined as $\tilde{q}_j = (1, \hat{\mathbf{q}}_j)$, where $\hat{\mathbf{q}}_j = \mathbf{q}_j / |\mathbf{q}_j|$ is the
unit direction vector obtained from $q_j = (E_j, \mathbf{q}_j)$. The explicit form
of $\Omega_R$ is:

$\Omega_R = \frac{2}{R^2} \sum_j q_j \tilde{q}_j + \sum_a \left( 1 - \sum_j z_{aj} \right) E_a$.

The first term in the above equation "measures" the width of the jets and the second
is the fraction of the energy of the event that does not take part in
any jet formation. The positive parameter R has a similar meaning
to the radius parameter in cone algorithms in the sense that a smaller
value of R results in narrower jets and more energy left outside jets.
A large (> 2) value of R forces the energy left outside jets to zero.
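
To make the definition concrete, the following Python sketch evaluates
$\Omega_R$ for a given recombination matrix. It is a minimal illustration
under stated assumptions rather than the OJF code: particles are tuples
(E, px, py, pz), the product $q_j \tilde{q}_j$ is taken as the Minkowski
product with signature (+,-,-,-), which gives $E_j - |\mathbf{q}_j|$, and the
width term carries the prefactor $2/R^2$. Energies are used as supplied;
dividing them by the total event energy beforehand turns the second
term into an energy fraction. The function name is hypothetical.

```python
import math


def omega_r(particles, z, radius):
    """Evaluate Omega_R for particles (E, px, py, pz) and matrix z[a][j]."""
    n_part = len(particles)
    n_jets = len(z[0]) if z else 0
    width = 0.0
    for j in range(n_jets):
        # Jet 4-momentum q_j = sum_a z_aj p_a.
        q = [sum(z[a][j] * particles[a][k] for a in range(n_part))
             for k in range(4)]
        q_abs = math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)
        # q_j . q~_j = E_j - |q_j| (assumed Minkowski product).
        width += q[0] - q_abs
    # Energy not assigned to any jet.
    soft = sum((1.0 - sum(z[a])) * particles[a][0] for a in range(n_part))
    return 2.0 / radius ** 2 * width + soft
```

Two collinear massless particles placed in one jet give a vanishing
width term, whereas two back-to-back particles forced into one jet give
the largest possible width for their energy.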

If the number of jets to which the event should be reconstructed is
already known, one finds the $z_{aj}$ that minimizes $\Omega_R$ given in the above
equation. This value of $z_{aj}$ describes the final desired configuration of
jets. The minimization problem is non-trivial because of the large
dimension of the domain in which to search for the global minimum:
there are $N_{\rm part} \times N_{\rm jets} = O(100\text{-}1000)$ continuous variables $z_{aj}$. However,
it is possible to solve it due to the known analytical structure of $\Omega_R$
and the regular structure of the domain of $z_{aj}$. An efficient
implementation, called the Optimal Jet Finder (OJF), is described in detail in
[5] and the FORTRAN 77 code is available from [6]. The program
starts with some initial value of $z_{aj}$, which in the simplest case can be
entirely random, and descends iteratively into a local minimum of
$\Omega_R$. In order to find the global minimum, random initial values of $z_{aj}$
are generated several times ($n_{\rm tries}$) and the deepest minimum is
chosen out of the local minima obtained at each try.
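
The multi-start strategy can be illustrated with a deliberately
simplified Python sketch. It is not the OJF minimization: as a stand-in
we restrict $z_{aj}$ to the values 0 and 1 (each particle belongs to one
jet or to none) and descend by moving one particle at a time whenever
that lowers $\Omega_R$. The form of $\Omega_R$ used here, with width term
$(2/R^2) \sum_j (E_j - |\mathbf{q}_j|)$, the function names and the `seed`
argument are assumptions of the sketch.

```python
import math
import random


def omega_hard(particles, assign, n_jets, radius):
    """Omega_R for a 0/1 assignment; assign[a] is a jet index or -1 (no jet)."""
    q = [[0.0] * 4 for _ in range(n_jets)]
    soft = 0.0
    for p, j in zip(particles, assign):
        if j < 0:
            soft += p[0]
        else:
            for k in range(4):
                q[j][k] += p[k]
    width = sum(qj[0] - math.sqrt(qj[1] ** 2 + qj[2] ** 2 + qj[3] ** 2)
                for qj in q)
    return 2.0 / radius ** 2 * width + soft


def descend(particles, assign, n_jets, radius):
    """Greedy single-particle moves until no move lowers Omega_R."""
    assign = list(assign)
    best = omega_hard(particles, assign, n_jets, radius)
    improved = True
    while improved:
        improved = False
        for a in range(len(particles)):
            best_j = assign[a]
            for j in range(-1, n_jets):
                if j == best_j:
                    continue
                assign[a] = j
                value = omega_hard(particles, assign, n_jets, radius)
                if value < best - 1e-12:
                    best, best_j, improved = value, j, True
            assign[a] = best_j
    return best, assign


def minimize_omega(particles, n_jets, radius, n_tries=10, seed=0):
    """Keep the deepest of n_tries local minima found from random starts."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_tries):
        start = [rng.randrange(n_jets) for _ in particles]
        result = descend(particles, start, n_jets, radius)
        if best is None or result[0] < best[0]:
            best = result
    return best
```

Each try costs a number of $\Omega_R$ evaluations that depends on the
starting point, so the total cost of this sketch grows in proportion to
`n_tries`.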

If the number of jets should be determined in the process of jet
finding, one repeats the above described reconstruction for the number
of jets equal to 1, 2, 3, ... and takes the smallest number of jets for which
the minimum of $\Omega_R$ is sufficiently small, i.e. $\Omega_R < \omega_{\rm cut}$, where $\omega_{\rm cut}$ is
a positive parameter chosen by the user. $\omega_{\rm cut}$ has a similar meaning
to the $y_{\rm cut}$ parameter in the successive recombination algorithms.
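
The search over the number of jets amounts to a short loop, sketched
here in Python. The callable `min_omega_for` is hypothetical: it stands
for any routine that returns the minimum of $\Omega_R$ found for a given
number of jets. The upper limit `max_jets` is also our assumption, added
so that the loop always terminates.

```python
def find_n_jets(min_omega_for, omega_cut, max_jets=10):
    """Return (n_jets, omega) for the smallest n_jets with omega < omega_cut.

    Returns None if no jet number up to max_jets satisfies the cut.
    """
    for n_jets in range(1, max_jets + 1):
        omega = min_omega_for(n_jets)
        if omega < omega_cut:
            return n_jets, omega
    return None
```

A smaller value of `omega_cut` demands a better description of the
event and therefore tends to yield more jets.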

The shapes of jets are determined dynamically in OJD (as opposed
to the fixed shapes of cones in the cone algorithms). Jet overlaps are
handled automatically, without the need for any arbitrary
prescriptions. OJD is independent of whether input particles are split into
collinear groups (collinear safety). OJD is also infrared safe, i.e. any
soft particle radiation results in only a soft (small) change in the
structure of jets. (So it avoids the serious problems of cone algorithms
based on seeds.) OJD, as opposed to successive recombination
algorithms, takes into account the global structure of the energy flow in
the event (rather than merging a single pair of particles at a time).

4 Details of the test 

We performed a simple, benchmark Monte Carlo test of the Optimal 
Jet Definition. The analysis was modeled after a similar one performed
by the OPAL collaboration on LEP II data [7].

We simulated the process $e^+ e^- \to W^+ W^- \to$ hadrons at a CM
energy of 180 GeV using PYTHIA 6.2 [8]. We reconstructed each event
to 4 jets using OJF and, for comparison, two binary jet algorithms: $k_T$ and JADE.
For OJF, we chose R = 2 and employed the most primitive
variant of the OJF-based algorithm with a fixed $n_{\rm tries} = 10$ for all events.
The jets can be combined into two pairs (supposedly resulting from
decays of the W's) in three different ways. We chose the combination
with the smallest difference in invariant masses between the two
pairs and calculated the average m of the two masses. We generated
the probability distribution $\pi_M(m)$ with the W-boson mass M as a
parameter. The smallest error of parameter estimation corresponding
to the number $N_{\rm exp}$ of experimental events (as given by the
Rao-Frechet-Cramer theorem) is

$\delta M_{\rm exp} = \left[ N_{\rm exp} \int dm \, \pi_M(m) \left( \partial \ln \pi_M(m) / \partial M \right)^2 \right]^{-1/2}$.

We can use this number directly to evaluate the jet algorithms.
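
The choice among the three pairings can be sketched as follows. This
Python fragment is an illustration only: the jets are assumed to be
tuples (E, px, py, pz), the function names are hypothetical, and a
negative squared mass caused by rounding is clipped to zero.

```python
import math

# The three ways to split four jets into two pairs.
PAIRINGS = (((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2)))


def invariant_mass(p):
    """Invariant mass of a 4-momentum (E, px, py, pz)."""
    m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(m2, 0.0))


def add(p, q):
    """Component-wise sum of two 4-momenta."""
    return tuple(a + b for a, b in zip(p, q))


def average_pair_mass(jets):
    """Average mass of the pairing with the smallest mass difference."""
    best = None
    for (a, b), (c, d) in PAIRINGS:
        m1 = invariant_mass(add(jets[a], jets[b]))
        m2 = invariant_mass(add(jets[c], jets[d]))
        if best is None or abs(m1 - m2) < best[0]:
            best = (abs(m1 - m2), 0.5 * (m1 + m2))
    return best[1]
```

The returned value plays the role of the average m that enters the
distribution $\pi_M(m)$.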



5 Results 

The statistical error $\delta M_{\rm exp}$ of the W-boson mass corresponding to
1000 experimental events is displayed in the table below:



ALGORITHM       $\delta M_{\rm exp}$ ± 3 MeV
OJD/OJF         106
$k_T$ (Durham)  105
JADE            118



(The error of 3 MeV in our results is dominated by the uncertainties
in the numerical differentiation with respect to M.) Within the
obtained precision, Durham and OJF are equivalent with respect to
accuracy, whereas JADE appears to be worse.
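
The role of the numerical differentiation can be seen in a small Python
sketch of the Rao-Frechet-Cramer estimate. It is an illustration under
our own assumptions, not the code used for the table: `pdf(m, M)` is a
hypothetical callable returning the normalized density of the average
mass m for W-boson mass M (in the analysis it would come from Monte
Carlo samples), the derivative of $\ln \pi_M(m)$ with respect to M is
approximated by a central difference with a user-chosen `step`, and the
integral over m is a plain sum over an equally spaced grid.

```python
import math


def cramer_rao_error(pdf, mass, n_events, m_grid, step=0.01):
    """Smallest statistical error on M for n_events, from pdf(m, M)."""
    dm = m_grid[1] - m_grid[0]
    info = 0.0  # Fisher information per event
    for m in m_grid:
        low = pdf(m, mass - step)
        mid = pdf(m, mass)
        high = pdf(m, mass + step)
        if min(low, mid, high) <= 0.0:
            continue  # skip points where the logarithm is undefined
        dlog = (math.log(high) - math.log(low)) / (2.0 * step)
        info += mid * dlog ** 2 * dm
    return 1.0 / math.sqrt(n_events * info)
```

For a Gaussian distribution of width $\sigma$ centred at M the sketch
reproduces the familiar $\sigma / \sqrt{N_{\rm exp}}$.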

An important aspect is the speed of the algorithms. The
average processing time per event depends on the number of particles
or detector cells in the input, $N_{\rm part}$. We observed the following
empirical relations (time in seconds): $1.2 \times 10^{-8} \times N_{\rm part}^3$ for $k_T$ and
$1.0 \times 10^{-4} \times N_{\rm part} \times n_{\rm tries}$ for OJF. $N_{\rm part}$ varied from 50 to 170 in
our sample, with a mean value of 83. However, the behavior was
verified for $N_{\rm part}$ up to 1700 by splitting each particle into 10 collinear
fragments (similarly to how a particle may hit several detector cells).

We observe that OJF is slower for a small number of particles or
detector cells, whereas for a large number of particles it appears to be
much faster. In the process we studied it starts to be more
efficient for $N_{\rm part} \approx 90 \sqrt{n_{\rm tries}}$.
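
The crossover point follows from equating the two empirical timing
relations. The short Python sketch below does this arithmetic, assuming
that the $k_T$ time grows as the cube of $N_{\rm part}$ (the power
consistent with the quoted crossover) and that the OJF time is linear
in both $N_{\rm part}$ and $n_{\rm tries}$. The function names are ours.

```python
import math

KT_COEFF = 1.2e-8   # seconds, multiplies N_part**3
OJF_COEFF = 1.0e-4  # seconds, multiplies N_part * n_tries


def kt_time(n_part):
    """Empirical k_T processing time per event in seconds."""
    return KT_COEFF * n_part ** 3


def ojf_time(n_part, n_tries):
    """Empirical OJF processing time per event in seconds."""
    return OJF_COEFF * n_part * n_tries


def crossover(n_tries):
    """N_part above which OJF is faster than k_T."""
    return math.sqrt(OJF_COEFF * n_tries / KT_COEFF)
```

With these relations `crossover(1)` is about 91, in line with the
quoted $90 \sqrt{n_{\rm tries}}$, and for `n_tries = 10` the crossover lies
near 290 particles.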

This may be a strong advantage. For instance, in the CDF or D0
data analysis, where the binary $k_T$ algorithm is commonly used, it is not
possible to analyze data directly from the calorimeter cells or even
towers because the processing time would be prohibitive. A preclustering procedure
(defined separately from the jet algorithm) is necessary to reduce the input
data to approximately 200 preclusters. With OJF, it is possible to test
how the preclustering step affects the results or even skip it altogether.



6 Summary 

We performed a Monte Carlo test of the Optimal Jet Definition. We
found that in the process we studied it gives the same accuracy as the
best algorithm applied previously to a similar analysis. OJD offers
new options yet to be explored, e.g. the weighting of events (according
to the value of $\Omega$) to enhance the precision. We found that the already
available implementation of OJD is very time efficient for analyses at
the level of calorimeter cells.